Check if a string fits a given regular expression pattern.
StringRegExp ( "test", "pattern" [, flag [, offset ] ] )
Parameters
test | The string to check |
pattern | The regular expression to compare. |
flag | [optional] A number to indicate how the function behaves. See below for details. The default is 0. |
offset | [optional] The string position at which to start the match (positions start at 1). The default is 1. |
Flag | Values |
0 | Returns 1 (matched) or 0 (no match) |
1 | Return array of matches. |
2 | Return array of matches including the full match (Perl / PHP style). |
3 | Return array of global matches. |
4 | Return an array of arrays containing global matches including the full match (Perl / PHP style). |
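The flag values are easier to compare side by side. The following is a minimal sketch, not part of the original reference: the sample text, the pattern and the variable names are made up for illustration.

```autoit
; Sketch: one pattern, three flags. Sample text and names are illustrative only.
Local $sText = "x=10, y=20"
Local $sPattern = "(\w)=(\d+)"

; Flag 0 (default): only reports whether the pattern matches.
Local $bFound = StringRegExp($sText, $sPattern)

; Flag 1: captured groups of the first match only.
Local $aFirst = StringRegExp($sText, $sPattern, 1)

; Flag 3: captured groups of every match, in one flat array.
Local $aAll = StringRegExp($sText, $sPattern, 3)

ConsoleWrite("Found: " & $bFound & @CRLF)
ConsoleWrite("Flag 1 elements: " & UBound($aFirst) & @CRLF) ; x, 10
ConsoleWrite("Flag 3 elements: " & UBound($aAll) & @CRLF)   ; x, 10, y, 20
```

Flags 2 and 4 return the same captures but also include the full match text, which helps when both the whole match and its parts are needed.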
Return Value
Flag = 0 :
@Error | Meaning |
2 | Bad pattern. @Extended = offset of error in pattern. |
Flag = 1 or 2 :
@Error | Meaning |
0 | Array is valid. Check @Extended for next offset. |
1 | Array is invalid. No matches. |
2 | Bad pattern, array is invalid. @Extended = offset of error in pattern. |
Flag = 3 or 4 :
@Error | Meaning |
0 | Array is valid. |
1 | Array is invalid. No matches. |
2 | Bad pattern, array is invalid. @Extended = offset of error in pattern. |
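Because an invalid array must not be indexed, it can help to check @Error before using the result. The sketch below is one possible way to do that, using a deliberately broken pattern; the variable names are hypothetical, and @error and @extended are copied right after the call so that later function calls cannot overwrite them.

```autoit
; Sketch: tell "no match" apart from "bad pattern" when using an array flag.
; The pattern is deliberately invalid (unbalanced parenthesis).
Local $aResult = StringRegExp("abc", "(\d+", 1)
Local $iError = @error, $iExtended = @extended

Switch $iError
    Case 0
        ConsoleWrite("First capture: " & $aResult[0] & @CRLF)
    Case 1
        ConsoleWrite("No matches." & @CRLF)
    Case 2
        ConsoleWrite("Bad pattern, error at pattern offset " & $iExtended & @CRLF)
EndSwitch
```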
Remarks
Regular expression notation is a compact way of specifying a pattern for strings that can be searched. Regular expressions are character strings in which plain text characters indicate what text should exist in the target string, and some characters are given special meanings to indicate what variability is allowed in the target string. AutoIt regular expressions are normally case-sensitive.
[ ... ] | Match any character in the set. e.g. [aeiou] matches any lower-case vowel. A contiguous set can be defined using a dash between the starting and ending characters. e.g. [a-z] matches any lower-case character. To include a dash (-) in a set, use it as the first or last character of the set. To include a closing bracket in a set, use it as the first character of the set. e.g. [][] will match either [ or ]. Note that special characters do not retain their special meanings inside a set, except that \\, \^, \-, \[ and \] match the escaped character inside a set. |
[^ ... ] | Match any character not in the set. e.g. [^0-9] matches any non-digit. To include a caret (^) in a set, put it after the beginning of the set or escape it (\^). |
[:class:] | Match a character in the given class of characters. Valid classes are: alpha (any alphabetic character), alnum (any alphanumeric character), lower (any lower-case letter), upper (any upper-case letter), digit (any decimal digit 0-9), xdigit (any hexadecimal digit, 0-9, A-F, a-f), space (any whitespace character), blank (only a space or tab), print (any printable character), graph (any printable character except spaces), cntrl (any control character [ascii 127 or <32]) or punct (any punctuation character). So [0-9] is equivalent to [[:digit:]]. |
[^:class:] | Match any character not in the class. The caret (^) only negates when it is the first character. |
( ... ) | Group. The elements in the group are treated in order and can be repeated together. e.g. (ab)+ will match "ab" or "abab", but not "aba". A group will also store the text matched for use in back-references and in the array returned by the function, depending on flag value. |
(?i) | Case-insensitivity flag. This does not operate as a group. It tells the regular expression engine to do case-insensitive matching from that point on. |
(?-i) | (default) Case-sensitivity flag. This does not operate as a group. It tells the regular expression engine to do case-sensitive matching from that point on. |
(?i ... ) | Case-insensitive group. Behaves just like a normal group, but performs case-insensitive matches within the group. |
(?-i ... ) | Case-sensitive group. Behaves just like a normal group, but performs case-sensitive matches within the group. Primarily for use after the (?i) flag or inside a case-insensitive group. |
(?: ... ) | Non-capturing group. Behaves just like a normal group, but does not record the matching characters in the array nor can the matched text be used for back-referencing. |
(?i: ... ) | Case-insensitive non-capturing group. Behaves just like a non-capturing group, but performs case-insensitive matches within the group. |
(?-i: ... ) | Case-sensitive non-capturing group. Behaves just like a non-capturing group, but performs case-sensitive matches within the group. |
(?m) | ^ and $ match newlines within data. |
(?s) | . matches anything including newline. (By default, "." does not match newline.) |
(?x) | Ignore whitespace and # comments. |
(?U) | Invert greediness of quantifiers. |
. | Match any single character (except newline). |
| | Or. The expression on one side or the other can be matched. |
\ | Escape a special character (have it match the actual character) or introduce a special character type (see below). |
\\ | Match an actual backslash (\). |
\a | Alarm, that is, the BEL character (chr(7)). |
\A | Match only at beginning of string. |
\b | Matches at a word boundary. |
\B | Matches when not at a word boundary. |
\c | Match a control character, based on the next character. For example, \cM matches ctrl-M. |
\d | Match any digit (0-9). |
\D | Match any non-digit. |
\e | Match an escape character (chr(27)). |
\E | End case modification. |
\f | Match a formfeed character (chr(12)). |
\h | Match any horizontal whitespace character. |
\H | Match any character that is not a horizontal whitespace character. |
\l | Match lowercase next char. |
\L | Match lowercase till \E. |
\n | Match a linefeed (@LF, chr(10)). |
\Q | quote (disable) pattern metacharacters till \E. |
\r | Match a carriage return (@CR, chr(13)). |
\s | Match any whitespace character: Chr(9) through Chr(13) which are Horizontal Tab, Line Feed, Vertical Tab, Form Feed, and Carriage Return, and the standard space ( Chr(32) ). |
\S | Match any non-whitespace character. |
\t | Match a tab character (chr(9)). |
\u | Match uppercase next char. |
\U | Match uppercase till \E. |
\v | Match any vertical whitespace character. |
\V | Match any character that is not a vertical whitespace character. |
\w | Match any "word" character: a-z, A-Z, 0-9 or underscore (_). |
\W | Match any non-word character. |
\### | Match the ASCII character whose code is given in octal (up to 3 digits), or a back-reference. If a prior group with the given number exists, it is treated as a back-reference and matches exactly the text that group matched. For example, ([[:alpha:]])\1 would match a double letter. |
\x## | Match the ascii character whose code is given in hexadecimal. Can be up to 2 digits. |
\z | Match only at end of string. |
\Z | Match only at end of string, or before newline at the end. |
{x} | Repeat the previous character, set or group exactly x times. |
{x,} | Repeat the previous character, set or group at least x times. |
{0,x} | Repeat the previous character, set or group at most x times. |
{x, y} | Repeat the previous character, set or group between x and y times, inclusive. |
* | Repeat the previous character, set or group 0 or more times. Equivalent to {0,} |
+ | Repeat the previous character, set or group 1 or more times. Equivalent to {1,} |
? | The previous character, set or group may or may not appear. Equivalent to {0, 1} |
? (after a repeating character) | Find the smallest match instead of the largest. |
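A few of the entries above are easier to follow with concrete input. The following sketch is illustrative only: the sample strings and variable names are invented, and it covers just greedy versus lazy repetition, the (?i) flag and a back-reference.

```autoit
; Sketch only: the sample strings are invented for illustration.
Local $sHtml = "<b>one</b> and <b>two</b>"

; Greedy: .* takes as much as possible, up to the last </b>.
Local $aGreedy = StringRegExp($sHtml, "<b>(.*)</b>", 1)
; Lazy: the ? after * stops at the first </b>.
Local $aLazy = StringRegExp($sHtml, "<b>(.*?)</b>", 1)

; (?i) makes the rest of the pattern case-insensitive.
Local $bAnyCase = StringRegExp("HELLO world", "(?i)hello")

; \1 must repeat exactly what group 1 captured, so this finds a doubled letter.
Local $aDouble = StringRegExp("bookkeeper", "([[:alpha:]])\1", 1)

ConsoleWrite($aGreedy[0] & @CRLF) ; one</b> and <b>two
ConsoleWrite($aLazy[0] & @CRLF)   ; one
ConsoleWrite($bAnyCase & @CRLF)   ; 1
ConsoleWrite($aDouble[0] & @CRLF) ; o
```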
[:alnum:] | letters and digits |
[:alpha:] | letters |
[:ascii:] | character codes 0 - 127 |
[:blank:] | space or tab only |
[:cntrl:] | control characters |
[:digit:] | decimal digits (same as \d) |
[:graph:] | printing characters, excluding space |
[:lower:] | lower case letters |
[:print:] | printing characters, including space |
[:punct:] | printing characters, excluding letters and digits |
[:space:] | white space (not quite the same as \s; it includes VT: chr(11)) |
[:upper:] | upper case letters |
[:word:] | "word" characters (same as \w) |
[:xdigit:] | hexadecimal digits |
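Class names are written inside a set, so the brackets are doubled, as in [[:digit:]]. The sketch below shows this with a made-up sample string and hypothetical variable names; it is an illustration, not an exhaustive test of every class.

```autoit
; Sketch: POSIX classes used inside a set. Sample text is illustrative only.
Local $sText = "Order 66, room B-12"

; [[:digit:]]+ is the class form of [0-9]+; flag 3 collects every match.
Local $aNumbers = StringRegExp($sText, "[[:digit:]]+", 3)

; [[:upper:]] matches a single upper-case letter.
Local $aCapitals = StringRegExp($sText, "[[:upper:]]", 3)

For $i = 0 To UBound($aNumbers) - 1
    ConsoleWrite("Number: " & $aNumbers[$i] & @CRLF) ; 66, then 12
Next
For $i = 0 To UBound($aCapitals) - 1
    ConsoleWrite("Capital: " & $aCapitals[$i] & @CRLF) ; O, then B
Next
```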
Related
StringInStr, StringRegExpReplace
Example
; Option 1, using offset
$nOffset = 1
While 1
    $array = StringRegExp('<test>a</test> <test>b</test> <test>c</Test>', '<(?i)test>(.*?)</(?i)test>', 1, $nOffset)
    If @error = 0 Then
        ; @extended holds the position at which to resume searching
        $nOffset = @extended
    Else
        ; no further match (or bad pattern): stop looping
        ExitLoop
    EndIf
    For $i = 0 To UBound($array) - 1
        MsgBox(0, "RegExp Test with Option 1 - " & $i, $array[$i])
    Next
WEnd
; Option 2, single return, php/preg_match() style
$array = StringRegExp('<test>a</test> <test>b</test> <test>c</Test>', '<(?i)test>(.*?)</(?i)test>', 2)
For $i = 0 To UBound($array) - 1
    MsgBox(0, "RegExp Test with Option 2 - " & $i, $array[$i])
Next
; Option 3, global return, old AutoIt style
$array = StringRegExp('<test>a</test> <test>b</test> <test>c</Test>', '<(?i)test>(.*?)</(?i)test>', 3)
For $i = 0 To UBound($array) - 1
    MsgBox(0, "RegExp Test with Option 3 - " & $i, $array[$i])
Next
; Option 4, global return, php/preg_match_all() style
$array = StringRegExp('F1oF2oF3o', '(F.o)*?', 4)
For $i = 0 To UBound($array) - 1
    ; each element is itself an array: full match first, then the groups
    $match = $array[$i]
    For $j = 0 To UBound($match) - 1
        MsgBox(0, "RegExp Test with Option 4 - " & $i & ',' & $j, $match[$j])
    Next
Next